# All Libraries
library(lubridate) # Date
library(binaryLogic) # Binary Encoding
# Visualizations
library(dplyr)
library(ggplot2)
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")
library(stringr)
# Correlation Analysis
library("PerformanceAnalytics")
library(Hmisc)
The dataset was downloaded from Kaggle: https://www.kaggle.com/arindam235/startup-investments-crunchbase
The dataset covers startup investments in 116 countries and was collected from the CrunchBase database. We considered only startups based in the USA. The factors we considered are the type of market; the state and region the startup belongs to; total funding (in USD); the different types of funding (seed, angel, crowdfunding, etc.); the founding date; the last funding date; and the operating status.
name: Name of the startup company
category_list: Category the startup belongs to
market: Type of market the startup belongs to
funding_total_usd: Total funding the startup received
status: Status of the startup – acquired, operational, closed
state_code: State in which the startup was founded
region: Region in which the startup was founded
city: City in which the startup was founded
funding_rounds: Number of funding rounds the startup went through
founded_at: Day on which the startup was founded
founded_month: Month in which the startup was founded
founded_quarter: Quarter in which the startup was founded – Q1, Q2, Q3, Q4
founded_year: Year in which the startup was founded
first_funding_day: Day on which the startup started receiving funding
first_funding_month: Month in which the startup started receiving funding
first_funding_year: Year in which the startup started receiving funding
last_funding_day: Day on which the startup stopped receiving funding
last_funding_month: Month in which the startup stopped receiving funding
last_funding_year: Year in which the startup stopped receiving funding
seed: Seed rounds are among the first rounds of funding a company will receive, generally while the company is young and working to gain traction. A seed round typically comes after an angel round (if applicable) and before a company’s Series A round.
venture: Venture funding refers to an investment that comes from a venture capital firm and describes Series A, Series B, and later rounds. This funding type is used for any funding round that is clearly a venture round but where the series has not been specified.
equity_crowdfunding: Equity crowdfunding platforms allow individual users to invest in companies in exchange for equity. Typically, on these platforms the investors invest small amounts of money, though syndicates are formed to allow an individual to take a lead on evaluating an investment and pooling funding from a group of individual investors.
undisclosed: Undisclosed amount on the last funding date.
convertible_note: A convertible note is an ‘in-between’ round funding to help companies hold over until they want to raise their next round of funding. When they raise the next round, this note ‘converts’ with a discount at the price of the new round. You will typically see convertible notes after a company raises, for example, a Series A round but does not yet want to raise a Series B round.
debt_financing: In a debt round, an investor lends money to a company, and the company promises to repay the debt with added interest.
angel: An angel round is typically a small round designed to get a new company off the ground. Investors in an angel round include individual angel investors, angel investor groups, friends, and family.
grant: A grant is when a company, investor, or government agency provides capital to a company without taking an equity stake in the company.
private_equity: A private equity round is led by a private equity firm or a hedge fund and is a late stage round. It is a less risky investment because the company is more firmly established, and the rounds are typically upwards of $50M.
post_ipo_equity: A post-IPO equity round takes place when firms invest in a company after the company has already gone public.
post_ipo_debt: A post-IPO debt round takes place when firms loan a company money after the company has already gone public. Similar to debt financing, a company will promise to repay the principal as well as added interest on the debt.
secondary_market: A secondary market transaction is a fundraising event in which one investor purchases shares of stock in a company from other, existing shareholders rather than from the company directly. These transactions often occur when a private company becomes highly valuable and early stage investors or employees want to earn a profit on their investment, and these transactions are rarely announced or publicized.
product_crowdfunding: In a product crowdfunding round, a company will provide its product, which is often still in development, in exchange for capital. This kind of round is also typically completed on a funding platform.
round_A: Round A is a funding round for earlier stage companies and typically ranges from $1M to $30M.
round_B: Round B is a funding round for earlier stage companies and typically ranges from $1M to $30M.
round_C: Round C is a funding round for later stage and more established companies. These rounds are usually $10M+ and are often much larger.
round_D: Round D is a funding round for later stage and more established companies. These rounds are usually $10M+ and are often much larger.
round_E: Round E is a funding round for later stage and more established companies. These rounds are usually $10M+ and are often much larger.
round_F: Round F is a funding round for later stage and more established companies. These rounds are usually $10M+ and are often much larger.
round_G: Round G is a funding round for later stage and more established companies. These rounds are usually $10M+ and are often much larger.
round_H: Round H is a funding round for later stage and more established companies. These rounds are usually $10M+ and are often much larger.
post_success: Indicates whether the startup was successful (1) or not (0), based on the acquisition and post-IPO equity rules described below.
dat <- read.csv("../Dataset/US_StartUps_Investments_Data.csv", na.strings = c("", "NA"), stringsAsFactors = FALSE)
head(dat)
On initial inspection, the dataset downloaded from Kaggle had a lot of missing data, so we removed every row with a missing value in one or more columns.
# Replacing commas in the column - funding_total_usd
dat$funding_total_usd <- as.numeric(gsub(",","",dat$funding_total_usd))
The values in the “funding_total_usd” column contained commas as thousands separators. We stripped the commas and converted the column to numeric.
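As a minimal illustration of the comma-stripping step, the sketch below applies the same gsub() and as.numeric() calls to a few strings. The sample values are made up for illustration and are not taken from the dataset.

```r
# Hypothetical raw values, formatted like funding_total_usd before cleaning
raw_funding <- c("1,750,000", "40,000", "2535000")

# Drop every comma, then convert the remaining digits to numeric
clean_funding <- as.numeric(gsub(",", "", raw_funding))
print(clean_funding)
```

Any value that still contains non-numeric characters after the commas are removed would become NA (with a warning), and would then be dropped by the complete-case filter.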
# Remove rows with NAs
dat <- dat[complete.cases(dat),]
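The row-removal step can be sketched on a toy data frame (the columns and values below are invented for illustration). complete.cases() returns TRUE only for rows with no NA in any column, so indexing with it keeps just the fully populated rows.

```r
# Toy data frame with missing values in two different columns (hypothetical data)
toy <- data.frame(
  name   = c("A", "B", "C"),
  market = c("Software", NA, "Mobile"),
  seed   = c(100, 200, NA),
  stringsAsFactors = FALSE
)

# TRUE for rows without any NA, FALSE otherwise
keep <- complete.cases(toy)

# Keep only the complete rows
toy_clean <- toy[keep, ]
print(toy_clean)
```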
# Remove leading and trailing spaces in column - market
dat$market <- trimws(dat$market, which = "both")
Each keyword in the market column had leading and trailing spaces, so we trimmed them using the trimws() function.
# Remove country_code, homepage_URL, X and permalink
dat$homepage_url <- NULL
dat$X <- NULL
dat$permalink <- NULL
dat$country_code <- NULL
We did not need the startup's homepage URL, X (the row ID), permalink (the permanent link in CrunchBase), or the country code, since we consider only startups in the United States of America (USA).
# Strip values in columns
# Month
dat$founded_month = substring(dat$founded_month,6)
dat$founded_month = as.numeric(dat$founded_month)
# Quarter
dat$founded_quarter = substring(dat$founded_quarter,6)
# YYYY-MM-DD. We only need the day.
dat$founded_at = substring(dat$founded_at,9)
dat$founded_at = as.numeric(dat$founded_at)
We observed that the values in the founded_month column were in Year-Month form. Since we needed only the month, we removed the year from those values. The founded_quarter column was handled the same way.
founded_at is a date in YYYY-MM-DD format, so we removed the year and month from the values to keep only the day (DD).
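The substring offsets used above can be checked on single values. The sample strings below are assumptions about the raw format, inferred from the offsets in the code (a four-digit year followed by a separator), not values copied from the dataset.

```r
# Assumed raw formats (hypothetical sample values)
founded_month   <- "2012-06"
founded_quarter <- "2012-Q2"
founded_at      <- "2012-06-01"

# Characters 6 onward hold the month or the quarter label
month_num <- as.numeric(substring(founded_month, 6))
quarter   <- substring(founded_quarter, 6)

# Characters 9 onward hold the day of the month
day_num <- as.numeric(substring(founded_at, 9))

print(c(month = month_num, day = day_num))
print(quarter)
```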
# Date DataType
dat$first_funding_at <- as.Date(dat$first_funding_at)
dat$last_funding_at <- as.Date(dat$last_funding_at)
# Separate date into three columns
dat$first_funding_day <- day(ymd(dat$first_funding_at))
dat$first_funding_month <- month(ymd(dat$first_funding_at))
dat$first_funding_year <- year(ymd(dat$first_funding_at))
dat$last_funding_day <- day(ymd(dat$last_funding_at))
dat$last_funding_month <- month(ymd(dat$last_funding_at))
dat$last_funding_year <- year(ymd(dat$last_funding_at))
# Remove Funding Dates
dat$first_funding_at <- NULL
dat$last_funding_at <- NULL
first_funding_at and last_funding_at are in YYYY-MM-DD format. We split each date into three columns: first_funding_day (DD), first_funding_month (MM) and first_funding_year (YYYY), and likewise last_funding_day, last_funding_month and last_funding_year.
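For readers without lubridate installed, the same day/month/year split can be sketched in base R with format(). This is an alternative illustration, not the code used in the analysis, and the two dates are sample values.

```r
# Sample funding dates (hypothetical values in YYYY-MM-DD form)
funding_dates <- as.Date(c("2012-06-30", "2011-03-02"))

# format() extracts one date component as text; as.integer() makes it numeric
parts <- data.frame(
  day   = as.integer(format(funding_dates, "%d")),
  month = as.integer(format(funding_dates, "%m")),
  year  = as.integer(format(funding_dates, "%Y"))
)
print(parts)
```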
# Create new column: success (1) or not (0)
# Successful if acquired, or post-IPO equity > 0 (or both)
# Use the vectorized | operator: && only works on length-one conditions
dat$post_success <- ifelse(dat$status == "acquired" | dat$post_ipo_equity > 0,
1, 0) # all other values map to 0
dat$post_success <- factor(dat$post_success)
Both an IPO (Initial Public Offering) and an M&A (Mergers & Acquisitions) transaction are critical events that classify a startup as successful.
We created the “post_success” column, which indicates whether the startup was successful based on this rule: the status of the company is “acquired”, or its post-IPO equity is greater than 0 (or both).
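The success rule can be sketched on a toy data frame (the rows below are invented for illustration). The sketch uses the element-wise | operator, because ifelse() needs a condition with one value per row.

```r
# Toy rows covering each case: acquired, post-IPO equity only, neither, closed
toy <- data.frame(
  status          = c("acquired", "operating", "operating", "closed"),
  post_ipo_equity = c(0, 5e6, 0, 0),
  stringsAsFactors = FALSE
)

# 1 if acquired or post-IPO equity is positive, otherwise 0
toy$post_success <- ifelse(toy$status == "acquired" | toy$post_ipo_equity > 0, 1, 0)
toy$post_success <- factor(toy$post_success)
print(toy)
```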
str(dat)
## 'data.frame': 18280 obs. of 41 variables:
## $ name : chr "#waywire" "1-800-DOCTORS" "10-20 Media" "1000 Corks" ...
## $ category_list : chr "|Entertainment|Politics|Social Media|News|" "|Health and Wellness|" "|E-Commerce|" "|Search|" ...
## $ market : chr "News" "Health and Wellness" "E-Commerce" "Search" ...
## $ funding_total_usd : num 1750000 1750000 2050000 40000 2535000 ...
## $ status : chr "acquired" "operating" "operating" "operating" ...
## $ state_code : chr "NY" "NJ" "MD" "OR" ...
## $ region : chr "New York City" "Newark" "Baltimore" "Portland, Oregon" ...
## $ city : chr "New York" "Iselin" "Woodbine" "Lake Oswego" ...
## $ funding_rounds : num 1 1 4 1 2 6 1 2 1 1 ...
## $ founded_at : num 1 1 1 1 1 1 4 16 1 1 ...
## $ founded_month : num 6 1 1 1 7 1 7 9 1 4 ...
## $ founded_quarter : chr "Q2" "Q1" "Q1" "Q1" ...
## $ founded_year : num 2012 1984 2001 2008 2010 ...
## $ seed : num 1750000 0 0 40000 15000 0 420000 750000 0 50000 ...
## $ venture : num 0 0 0 0 2520000 ...
## $ equity_crowdfunding : num 0 0 0 0 0 0 0 0 0 0 ...
## $ undisclosed : num 0 0 0 0 0 0 0 0 0 0 ...
## $ convertible_note : num 0 1750000 0 0 0 0 0 0 0 0 ...
## $ debt_financing : num 0 0 2050000 0 0 ...
## $ angel : num 0 0 0 0 0 0 0 0 0 0 ...
## $ grant : num 0 0 0 0 0 0 0 0 0 0 ...
## $ private_equity : num 0 0 0 0 0 0 0 0 0 0 ...
## $ post_ipo_equity : num 0 0 0 0 0 0 0 0 0 0 ...
## $ post_ipo_debt : num 0 0 0 0 0 0 0 0 0 0 ...
## $ secondary_market : num 0 0 0 0 0 0 0 0 0 0 ...
## $ product_crowdfunding: num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_A : num 0 0 0 0 2520000 0 0 0 0 0 ...
## $ round_B : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_C : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_D : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_E : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_F : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_G : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_H : num 0 0 0 0 0 0 0 0 0 0 ...
## $ first_funding_day : int 30 2 18 23 1 14 26 2 8 1 ...
## $ first_funding_month : num 6 3 6 8 1 10 11 11 3 4 ...
## $ first_funding_year : num 2012 2011 2009 2011 2010 ...
## $ last_funding_day : int 30 2 28 23 16 19 26 30 8 1 ...
## $ last_funding_month : num 6 3 12 8 2 9 11 11 3 4 ...
## $ last_funding_year : num 2012 2011 2011 2011 2011 ...
## $ post_success : Factor w/ 2 levels "0","1": 2 1 1 1 2 1 1 2 1 1 ...
# Saving file
write.csv(dat,'Data_Cleansed.csv')
saveRDS(dat, file = "Data_Cleansed.rds")
The market, region and state_code columns each have more than 32 distinct values. We binary encoded these columns: each category is mapped to an integer, the integer is written as a binary string, and each digit of that string is placed in its own column. This needs far fewer columns than one-hot encoding, which is useful for machine learning, where a better encoding of categorical data can mean better model performance.
# Function for Binary Encoding
encode_binary <- function(x, order = unique(x), name = "v_") {
x <- as.numeric(factor(x, levels = order, exclude = NULL))
x2 <- as.binary(x)
maxlen <- max(sapply(x2, length))
x2 <- lapply(x2, function(y) {
l <- length(y)
if (l < maxlen) {
y <- c(rep(0, (maxlen - l)), y)
}
y
})
d <- as.data.frame(t(as.data.frame(x2)))
rownames(d) <- NULL
colnames(d) <- paste0(name, 1:maxlen)
d
}
# dat -> Visualizations
# filter_data -> ML
new_df <- cbind(dat, encode_binary(dat[["market"]], name = "market_"))
new_df <- cbind(new_df, encode_binary(dat[["region"]], name = "region_"))
new_df <- cbind(new_df, encode_binary(dat[["state_code"]], name = "state_code_"))
saveRDS(new_df, file = "Data_Cleansed_Encoded.rds")
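To make the encoding concrete, here is a minimal base-R sketch of the same idea that does not depend on binaryLogic. The function name encode_binary_base and the sample input are hypothetical, and the sketch is not guaranteed to reproduce the exact column layout of encode_binary() above.

```r
# Binary-encode a categorical vector: category -> integer code -> binary digits
encode_binary_base <- function(x, name = "v_") {
  # Integer code per category, in order of first appearance (NA gets its own code)
  codes <- as.integer(factor(x, levels = unique(x), exclude = NULL))

  # Number of binary digits needed for the largest code
  n_bits <- max(which(as.integer(intToBits(max(codes))) == 1L))

  # Extract each digit, most significant first
  shifts <- rev(seq_len(n_bits) - 1L)
  bits <- sapply(shifts, function(s) (codes %/% 2^s) %% 2)
  bits <- matrix(bits, nrow = length(codes))

  colnames(bits) <- paste0(name, seq_len(n_bits))
  as.data.frame(bits)
}

# Three categories need two binary columns: a = 01, b = 10, c = 11
encoded <- encode_binary_base(c("a", "b", "c", "a"), name = "m_")
print(encoded)
```

With k distinct categories, this produces about log2(k) columns, whereas one-hot encoding would produce k.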
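We kept startups founded from 1990 onwards (the data runs through 2014). We dropped the identifier and raw categorical columns (name, category_list, market, state_code, region, city), converted the remaining categorical variables to factors, and converted the binary encoded columns from “num” to “logical” (boolean).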
# Filter the data and Factorize it!
filter_data <- new_df[ which(new_df$founded_year >= 1990), ]
# Datatypes
# Categorical Variables: name, category_list, market, status, state_code, region, city, founded_at, founded_month, founded_quarter,
# founded_year, first_funding_day, first_funding_month, first_funding_year, last_funding_day, last_funding_month,
# last_funding_year
filter_data$name <- NULL
filter_data$category_list<- NULL
filter_data$market<- NULL
filter_data$status<-factor(filter_data$status)
filter_data$state_code<- NULL
filter_data$region<- NULL
filter_data$city<- NULL
filter_data$founded_at<-factor(filter_data$founded_at)
filter_data$founded_month<-factor(filter_data$founded_month)
filter_data$founded_quarter<-factor(filter_data$founded_quarter)
filter_data$founded_year<-factor(filter_data$founded_year)
filter_data$first_funding_day<-factor(filter_data$first_funding_day)
filter_data$first_funding_month<-factor(filter_data$first_funding_month)
filter_data$first_funding_year<-factor(filter_data$first_funding_year)
filter_data$last_funding_day<-factor(filter_data$last_funding_day)
filter_data$last_funding_month<-factor(filter_data$last_funding_month)
filter_data$last_funding_year<-factor(filter_data$last_funding_year)
# Convert every binary encoded column (market_*, region_*, state_code_*) to logical
bin_cols <- grep("^(market|region|state_code)_[0-9]+$", names(filter_data), value = TRUE)
filter_data[bin_cols] <- lapply(filter_data[bin_cols], as.logical)
saveRDS(filter_data, file="Data_CE_Filtered.rds")
str(filter_data)
## 'data.frame': 17784 obs. of 59 variables:
## $ funding_total_usd : num 1750000 2050000 40000 2535000 4962651 ...
## $ status : Factor w/ 3 levels "acquired","closed",..: 1 3 3 1 3 3 1 3 2 3 ...
## $ funding_rounds : num 1 4 1 2 6 1 2 1 1 1 ...
## $ founded_at : Factor w/ 31 levels "1","2","3","4",..: 1 1 1 1 1 4 16 1 1 1 ...
## $ founded_month : Factor w/ 12 levels "1","2","3","4",..: 6 1 1 7 1 7 9 1 4 1 ...
## $ founded_quarter : Factor w/ 4 levels "Q1","Q2","Q3",..: 2 1 1 3 1 3 3 1 2 1 ...
## $ founded_year : Factor w/ 25 levels "1990","1991",..: 23 12 19 21 19 25 22 11 20 23 ...
## $ seed : num 1750000 0 40000 15000 0 420000 750000 0 50000 0 ...
## $ venture : num 0 0 0 2520000 3814772 ...
## $ equity_crowdfunding : num 0 0 0 0 0 0 0 0 0 0 ...
## $ undisclosed : num 0 0 0 0 0 0 0 0 0 0 ...
## $ convertible_note : num 0 0 0 0 0 0 0 0 0 0 ...
## $ debt_financing : num 0 2050000 0 0 1147879 ...
## $ angel : num 0 0 0 0 0 0 0 0 0 0 ...
## $ grant : num 0 0 0 0 0 0 0 0 0 0 ...
## $ private_equity : num 0 0 0 0 0 0 0 0 0 0 ...
## $ post_ipo_equity : num 0 0 0 0 0 0 0 0 0 0 ...
## $ post_ipo_debt : num 0 0 0 0 0 0 0 0 0 0 ...
## $ secondary_market : num 0 0 0 0 0 0 0 0 0 0 ...
## $ product_crowdfunding: num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_A : num 0 0 0 2520000 0 0 0 0 0 0 ...
## $ round_B : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_C : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_D : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_E : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_F : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_G : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_H : num 0 0 0 0 0 0 0 0 0 0 ...
## $ first_funding_day : Factor w/ 31 levels "1","2","3","4",..: 30 18 23 1 14 26 2 8 1 7 ...
## $ first_funding_month : Factor w/ 12 levels "1","2","3","4",..: 6 6 8 1 10 11 11 3 4 11 ...
## $ first_funding_year : Factor w/ 24 levels "11","1990","1993",..: 22 19 21 20 19 24 21 20 19 22 ...
## $ last_funding_day : Factor w/ 31 levels "1","2","3","4",..: 30 28 23 16 19 26 30 8 1 7 ...
## $ last_funding_month : Factor w/ 12 levels "1","2","3","4",..: 6 12 8 2 9 11 11 3 4 11 ...
## $ last_funding_year : Factor w/ 22 levels "1990","1993",..: 20 19 19 19 22 22 19 18 17 20 ...
## $ post_success : Factor w/ 2 levels "0","1": 2 1 1 2 1 1 2 1 1 1 ...
## $ market_1 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_2 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_3 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_4 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_5 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_6 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_7 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ market_8 : logi FALSE FALSE TRUE TRUE TRUE TRUE ...
## $ market_9 : logi FALSE TRUE FALSE FALSE FALSE TRUE ...
## $ market_10 : logi TRUE TRUE FALSE TRUE TRUE FALSE ...
## $ region_1 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ region_2 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ region_3 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ region_4 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ region_5 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ region_6 : logi FALSE FALSE TRUE TRUE TRUE TRUE ...
## $ region_7 : logi FALSE TRUE FALSE FALSE TRUE TRUE ...
## $ region_8 : logi TRUE TRUE FALSE TRUE FALSE TRUE ...
## $ state_code_1 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ state_code_2 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ state_code_3 : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ state_code_4 : logi FALSE FALSE TRUE TRUE TRUE TRUE ...
## $ state_code_5 : logi FALSE TRUE FALSE FALSE TRUE TRUE ...
## $ state_code_6 : logi TRUE TRUE FALSE TRUE FALSE TRUE ...
Finally, the dataset has no missing values. We also changed the data type of each column to suit the analysis that follows.
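Both claims can be verified programmatically. The sketch below runs the checks on a small stand-in data frame (hypothetical columns and values); the same calls could be pointed at filter_data.

```r
# Stand-in for filter_data (hypothetical values)
toy <- data.frame(
  founded_year = c(2001, 2010, 2010),
  status       = c("acquired", "operating", "closed"),
  market_1     = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Convert categorical columns to factors and an encoded column to logical
factor_cols <- c("founded_year", "status")
toy[factor_cols] <- lapply(toy[factor_cols], factor)
toy$market_1 <- as.logical(toy$market_1)

# Count missing values per column and list the resulting column types
na_per_column <- colSums(is.na(toy))
column_types  <- sapply(toy, class)
print(na_per_column)
print(column_types)
```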
# Founding year
viz_data <- dat
viz_data$founded_year <- factor(viz_data$founded_year)
viz_data %>% count(founded_year)%>% arrange(-n) %>% head(25) %>%
ggplot(aes(reorder(founded_year, n), n)) + geom_col(aes(fill=founded_year)) + coord_flip() +
theme(legend.position="none") +
ggtitle("Year Founded") + xlab("Year") + ylab("Count") +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "Facebook")) +
geom_text(aes(x = founded_year, y=2000, label = "Facebook"), data = viz_data %>% filter(name == "Facebook"), size=4, vjust=-0.3, hjust=-0.3) +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "Twitter")) +
geom_text(aes(x = founded_year, y=2000, label = "Twitter"), data = viz_data %>% filter(name == "Twitter"), size=4, vjust=-0.3, hjust=-0.3) +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "Google")) +
geom_text(aes(x = founded_year, y=2000, label = "Google"), data = viz_data %>% filter(name == "Google"), size=4, vjust=-0.3, hjust=-0.3) +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "YouTube")) +
geom_text(aes(x = founded_year, y=2000, label = "YouTube"), data = viz_data %>% filter(name == "YouTube"), size=4, vjust=-0.3, hjust=-0.3) +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "Instagram")) +
geom_text(aes(x = founded_year, y=2000, label = "Instagram"), data = viz_data %>% filter(name == "Instagram"), size=4, vjust=-0.3, hjust=-0.3) +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "Uber")) +
geom_text(aes(x = founded_year, y=2000, label = "Uber,"), data = viz_data %>% filter(name == "Uber"), size=4, vjust=-0.3, hjust=0.8) +
geom_vline(aes(xintercept = founded_year), data = viz_data %>% filter(name == "WhatsApp")) +
geom_text(aes(x = founded_year, y=2000, label = "WhatsApp"), data = viz_data %>% filter(name == "WhatsApp"), size=4, vjust=-0.3, hjust=-0.3)
From the graph, we can observe that many successful companies such as Facebook, Twitter, Uber, YouTube and WhatsApp were founded after the year 2000. The number of startups founded per year also increased significantly from 2000 onwards.
# State Code
dat %>% count(state_code)%>% arrange(-n) %>% head(20) %>%
ggplot(aes(reorder(state_code, n), n)) + geom_col(aes(fill=state_code)) + coord_flip() +
theme(legend.position="none") +
ggtitle("Number of startups in States") + xlab("States") + ylab("Count")
From the graph, we can observe that California (CA) has the highest number of startups, followed by New York (NY). Massachusetts (MA), Texas (TX) and Washington (WA) also have a large number of startups.
# Top 10 market leaders
dat %>% filter(market!="") %>% count(market)%>% arrange(-n) %>% head(10) %>%
ggplot(aes(reorder(market, n), n)) + geom_col(aes(fill=market)) + coord_flip() +
theme(legend.position="none") +
ggtitle("Top 10 Market Leaders") + xlab("Markets") + ylab("Count")
We can observe that many startups are in the Software, Biotechnology, Health Care and E-Commerce markets.
# Markets
dat %>% count(market)%>% arrange(-n) %>% head(30) %>%
ggplot(aes(reorder(market, n), n)) + geom_col(aes(fill=market)) + coord_flip() +
theme(legend.position="none") +
ggtitle("Number of Startups in Different Markets") + xlab("Markets") + ylab("Count")
Software, Biotechnology, Health Care and E-Commerce remain the largest markets when the chart is extended to the top 30.
# Regions
dat %>% count(region)%>% arrange(-n) %>% head(20) %>%
ggplot(aes(reorder(region, n), n)) + geom_col(aes(fill=region)) + coord_flip() +
theme(legend.position="none") +
ggtitle("Number of Startups in Region") + xlab("Regions") + ylab("Count")
Regions such as the San Francisco Bay Area, New York City, Boston, Los Angeles and Washington DC have a large number of startups.
# Cities
dat %>% count(city)%>% arrange(-n) %>% head(30) %>%
ggplot(aes(reorder(city, n), n)) + geom_col(aes(fill=city)) + coord_flip() +
theme(legend.position="none") +
ggtitle("Number of Startups in Cities") + xlab("Cities") + ylab("Count")
San Francisco and New York have many more startups than other cities.
# No. of Startups vs Status
dat %>% count(status)%>% arrange(-n) %>%
ggplot(aes(reorder(status, n), n)) + geom_col(aes(fill=status)) +
theme(legend.position="none") +
ggtitle("No. of Startups vs Status") + xlab("Status") + ylab("Count")
We can observe that most of the startups were still operating as of 2014. Many have been acquired, and only a few have closed, some of which might resume operating in the future.
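The bar charts above all share one counting step: count each value, sort in descending order, and keep the top n. A base-R sketch of that step is shown below; the helper name top_counts and the sample vector are hypothetical.

```r
# Return the n most frequent values of x with their counts
top_counts <- function(x, n = 10) {
  counts <- sort(table(x), decreasing = TRUE)
  counts <- head(counts, n)
  data.frame(
    value = names(counts),
    n     = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Sample state codes (made-up data)
states <- c("CA", "CA", "CA", "NY", "NY", "MA")
top_states <- top_counts(states, n = 2)
print(top_states)
```

The resulting data frame has the same shape as the output of count() followed by arrange(-n) and head(n), so it could be passed to the same ggplot() calls.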
docs <- apply(dat['category_list'], 2, function(y) str_sub(y, 2, -2))
docs <- apply(docs, 2, function(y) unlist(strsplit(y, "\\|")))
docs <- Corpus(VectorSource(docs))
# Convert the text to lower case
docs <- tm_map(docs, content_transformer(tolower))
## Warning in tm_map.SimpleCorpus(docs, content_transformer(tolower)):
## transformation drops documents
# Remove numbers
docs <- tm_map(docs, removeNumbers)
## Warning in tm_map.SimpleCorpus(docs, removeNumbers): transformation drops
## documents
# Remove english common stopwords
docs <- tm_map(docs, removeWords, stopwords("english"))
## Warning in tm_map.SimpleCorpus(docs, removeWords, stopwords("english")):
## transformation drops documents
# Remove your own stop word
# specify your stopwords as a character vector
#docs <- tm_map(docs, removeWords, c("blabla1", "blabla2"))
# Remove punctuations
docs <- tm_map(docs, removePunctuation)
## Warning in tm_map.SimpleCorpus(docs, removePunctuation): transformation drops
## documents
# Eliminate extra white spaces
#docs <- tm_map(docs, stripWhitespace)
# Text stemming
# docs <- tm_map(docs, stemDocument)
dtm <- TermDocumentMatrix(docs)
m <- as.matrix(dtm)
v <- sort(rowSums(m),decreasing=TRUE)
d <- data.frame(word = names(v),freq=v)
head(d, 10)
set.seed(1234)
wordcloud(words = d$word, freq = d$freq, min.freq = 1,
max.words=200, random.order=FALSE, rot.per=0.35,
colors=brewer.pal(8, "Dark2"))
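The word-frequency table behind the word cloud can be approximated in base R. The sketch below is a simplified stand-in for the tm pipeline: the stop-word list is reduced to a single hypothetical entry, and the last sample string is invented, so the counts will not match the tm output exactly.

```r
# Sample category_list values: pipe-delimited tags wrapped in leading/trailing pipes
category_list <- c(
  "|Entertainment|Politics|Social Media|News|",
  "|Health and Wellness|",
  "|E-Commerce|",
  "|News|"
)

# Strip the outer pipes, then split each entry into individual tags
inner <- substring(category_list, 2, nchar(category_list) - 1)
tags  <- unlist(strsplit(inner, "|", fixed = TRUE))

# Lower-case, drop punctuation, and split tags into words
tags  <- gsub("[[:punct:]]", "", tolower(tags))
words <- unlist(strsplit(tags, "[[:space:]]+"))

# Tiny stand-in for stopwords("english")
words <- words[!words %in% c("and", "")]

# Word frequencies, most frequent first
freq <- sort(table(words), decreasing = TRUE)
print(freq)
```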
# Explore frequent terms and their associations
# List the terms in the term-document matrix that occur at least four times:
findFreqTerms(dtm, lowfreq = 4)
## [1] "entertainment" "politics" "media"
## [4] "social" "news" "health"
## [7] "wellness" "ecommerce" "search"
## [10] "curated" "web" "care"
## [13] "information" "technology" "analytics"
## [16] "software" "biotechnology" "devices"
## [19] "medical" "pharmaceuticals" "personalization"
## [22] "manufacturing" "graph" "interest"
## [25] "advertising" "sports" "real"
## [28] "time" "video" "games"
## [31] "shopping" "cosmetics" "personal"
## [34] "lifestyle" "fashion" "services"
## [37] "mobile" "big" "data"
## [40] "enterprise" "commerce" "art"
## [43] "non" "profit" "fundraising"
## [46] "clean" "finance" "marketplaces"
## [49] "cloud" "computing" "development"
## [52] "databases" "infrastructure" "corporate"
## [55] "local" "mmo" "internet"
## [58] "network" "consumers" "women"
## [61] "education" "apps" "design"
## [64] "photography" "creative" "consumer"
## [67] "electronics" "dental" "centers"
## [70] "bitcoin" "ipad" "electronic"
## [73] "records" "customer" "service"
## [76] "visualization" "chain" "management"
## [79] "supply" "drones" "beauty"
## [82] "retail" "travel" "sms"
## [85] "messaging" "hosting" "security"
## [88] "flash" "storage" "learning"
## [91] "machine" "capital" "venture"
## [94] "peerpeer" "networking" "entrepreneur"
## [97] "crowdfunding" "commercial" "estate"
## [100] "automotive" "energy" "wireless"
## [103] "crm" "hardware" "kids"
## [106] "loyalty" "programs" "distribution"
## [109] "digital" "gaming" "online"
## [112] "marketing" "browser" "extensions"
## [115] "predictive" "startups" "businesses"
## [118] "medium" "small" "shipping"
## [121] "semantic" "seo" "communities"
## [124] "brand" "consulting" "music"
## [127] "telecommunications" "crowdsourcing" "business"
## [130] "saas" "training" "presentations"
## [133] "enterprises" "email" "payments"
## [136] "productivity" "point" "sale"
## [139] "android" "ios" "iphone"
## [142] "communications" "virtualization" "optimization"
## [145] "batteries" "human" "resources"
## [148] "platforms" "teachers" "voip"
## [151] "therapeutics" "psychology" "privacy"
## [154] "accounting" "automation" "project"
## [157] "solar" "identity" "events"
## [160] "contact" "gamification" "generation"
## [163] "lead" "portals" "transportation"
## [166] "logistics" "detection" "fraud"
## [169] "cards" "credit" "sales"
## [172] "nonprofits" "recruiting" "career"
## [175] "semiconductors" "healthcare" "market"
## [178] "radio" "public" "relations"
## [181] "certification" "test" "physicians"
## [184] "incentives" "things" "cms"
## [187] "content" "collectibles" "toys"
## [190] "outdoors" "coupons" "comparison"
## [193] "financial" "targeting" "developer"
## [196] "tools" "planning" "resource"
## [199] "nanotechnology" "auctions" "television"
## [202] "displays" "recommendations" "reviews"
## [205] "collaboration" "universities" "publishing"
## [208] "fitness" "networks" "trusted"
## [211] "craigslist" "killers" "rental"
## [214] "buying" "file" "sharing"
## [217] "based" "location" "navigation"
## [220] "classifieds" "applications" "twitter"
## [223] "facebook" "dodmilitary" "weddings"
## [226] "tracking" "discovery" "archiving"
## [229] "image" "recognition" "asset"
## [232] "intellectual" "rim" "cars"
## [235] "blogging" "monetization" "outsourcing"
## [238] "graphics" "demand" "streaming"
## [241] "lighting" "performance" "intelligence"
## [244] "recipes" "adventure" "insurance"
## [247] "legal" "colleges" "gps"
## [250] "grid" "smart" "hospitality"
## [253] "rfid" "product" "alumni"
## [256] "risk" "card" "gift"
## [259] "synchronization" "ipod" "touch"
## [262] "agriculture" "industry" "language"
## [265] "natural" "processing" "office"
## [268] "space" "brokers" "sustainability"
## [271] "bookmarking" "hospitals" "hotels"
## [274] "syndication" "mobility" "signage"
## [277] "sensors" "aerospace" "chat"
## [280] "groceries" "banking" "text"
## [283] "artificial" "apis" "computer"
## [286] "vision" "emergencyhealth" "charter"
## [289] "schools" "home" "environmental"
## [292] "innovation" "goods" "price"
## [295] "delivery" "rights" "surveys"
## [298] "mining" "employment" "open"
## [301] "source" "geospatial" "simulation"
## [304] "exchanges" "stock" "independent"
## [307] "labels" "integration" "testing"
## [310] "broadcasting" "citizens" "senior"
## [313] "pets" "maps" "exercise"
## [316] "chemicals" "energies" "renewable"
## [319] "biometrics" "water" "mhealth"
## [322] "audio" "virtual" "worlds"
## [325] "beer" "craft" "robotics"
## [328] "dating" "opinions" "private"
## [331] "collaborative" "consumption" "discounts"
## [334] "subscription" "benefits" "employer"
## [337] "paas" "ticketing" "creators"
## [340] "app" "stores" "iaas"
## [343] "application" "monitoring" "restaurants"
## [346] "life" "sciences" "professional"
## [349] "codes" "markets" "cyber"
## [352] "efficiency" "industrial" "electrical"
## [355] "students" "freelancers" "editing"
## [358] "photo" "auto" "green"
## [361] "architecture" "angels" "research"
## [364] "experience" "user" "scheduling"
## [367] "billing" "engineering" "trading"
## [370] "garden" "linux" "meeting"
## [373] "artists" "globally" "task"
## [376] "gas" "oil" "diagnostics"
## [379] "computers" "campuses" "college"
## [382] "knowledge" "investors" "ediscovery"
## [385] "augmented" "reality" "tablets"
## [388] "parenting" "domination" "world"
## [391] "reputation" "bridging" "offline"
## [394] "systems" "writers" "providers"
## [397] "forums" "limousines" "clinical"
## [400] "trials" "industries" "conferencing"
## [403] "document" "homeland" "unifed"
## [406] "direct" "wholesale" "concerts"
## [409] "nightlife" "foods" "specialty"
## [412] "organic" "game" "gambling"
## [415] "jewelry" "babies" "shoes"
## [418] "currency" "incubators" "spas"
## [421] "spirits" "wine" "cooking"
## [424] "support" "field" "tech"
## [427] "sporting" "nutrition" "coworking"
## [430] "google" "doctors" "investment"
## [433] "nightclubs" "group" "printing"
## [436] "celebrity" "charity" "mechanics"
## [439] "interface" "cybersecurity" "commodities"
## [442] "funds" "hedge" "equipment"
## [445] "semiconductor" "nfc" "journalism"
## [448] "cad" "firms" "coffee"
## [451] "recycling" "diabetes" "textbooks"
## [454] "construction" "promotional" "tourism"
## [457] "textiles" "renovation" "reservations"
## [460] "ebooks" "leisure" "rentals"
## [463] "vacation" "gadget" "event"
## [466] "neuroscience" "assessment" "skill"
## [469] "browsers" "customization" "mass"
## [472] "building" "food" "fleet"
## [475] "franchises" "lending" "property"
## [478] "staffing" "utilities" "measurement"
## [481] "workforces" "high" "money"
## [484] "transfer" "tutoring" "realtors"
## [487] "transaction" "cable" "polling"
## [490] "freetoplay" "fantasy" "freemium"
## [493] "sponsorship" "algorithms" "film"
## [496] "governments" "guides" "licensing"
## [499] "self" "newsletters" "professionals"
## [502] "phone" "windows" "translation"
## [505] "behavior" "proximity" "housing"
## [508] "presence" "vertical" "procurement"
## [511] "domains" "center" "mac"
## [514] "contests" "embedded" "microblogging"
## [517] "new" "teenagers" "venues"
## [520] "courier" "postal" "indoor"
## [523] "positioning" "residential" "enforcement"
## [526] "law" "soccer" "wealth"
## [529] "defense" "intelligent" "production"
## [532] "educational" "advice" "speech"
## [535] "usability" "demographies" "kinect"
## [538] "diy" "humanitarian" "emerging"
## [541] "telephony" "matchmaking" "visual"
## [544] "expanding" "rapidly" "families"
## [547] "material" "science"
findAssocs(dtm, terms = "health", corlimit = 0.3)
## $health
## care wellness
## 0.72 0.63
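To make the association scores easier to read: `findAssocs` reports, for the chosen term, the other terms whose occurrence correlates with it at or above `corlimit`. The sketch below reproduces that idea in base R on a made-up document-term matrix (`toy_dtm` and its counts are hypothetical, not taken from the dataset), so it illustrates the principle rather than the exact `tm` internals.

```r
# Toy document-term matrix: rows are documents, columns are terms.
# The counts are invented for illustration only.
toy_dtm <- matrix(
  c(1, 1, 0,
    1, 1, 0,
    0, 0, 1,
    1, 0, 1,
    0, 0, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("health", "care", "games"))
)

# Pearson correlation of every term with "health"
assoc <- cor(toy_dtm)[, "health"]
assoc <- assoc[names(assoc) != "health"]

# Keep only terms at or above the correlation limit, strongest first
corlimit <- 0.3
assoc <- sort(assoc[assoc >= corlimit], decreasing = TRUE)
print(round(assoc, 2))
```

In this toy example only "care" passes the limit, because it appears in the same documents as "health" most of the time.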
#Plot word frequencies
barplot(d[1:10,]$freq, las = 2, names.arg = d[1:10,]$word,
col ="lightblue", main ="Most frequent words",
ylab = "Word frequencies")
# List the 10 markets with the most startups
dat %>%
count(market, sort = TRUE) %>%
head(10)
# Keep only the columns needed for the market plots
d = dat[, c("market", "post_success", "status", "founded_year")]
# Restrict to the 10 markets plotted below
top_markets = c("Software", "Biotechnology", "Mobile", "Enterprise Software", "Curated Web",
"Health Care", "E-Commerce", "Hardware + Software", "Advertising", "Health and Wellness")
d = subset(d, market %in% top_markets)
d
ggplot(data = d, aes(x=market, fill = post_success)) +
theme_classic() +
ggtitle("Distribution of Success Among Top 10 Markets") +
xlab("Market") +
ylab("Number of Companies") +
geom_bar(position="dodge") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
We can observe that startups in software, advertising and biotechnology have more successes than startups in the other markets.
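Because the bar chart shows counts, a per-market success share can make the comparison more direct. The following is a minimal sketch in base R on a made-up data frame `toy` (hypothetical, not the project data). It assumes `post_success` is coded 0/1, so a factor column would first need `as.numeric(as.character(...))`.

```r
# Hypothetical toy data; the real analysis would use the filtered data frame `d`
toy <- data.frame(
  market = c("Software", "Software", "Software", "Mobile", "Mobile", "Advertising"),
  post_success = c(1, 0, 1, 0, 0, 1)
)

# Share of successful startups per market (mean of a 0/1 indicator)
success_rate <- tapply(toy$post_success, toy$market, mean)
print(round(sort(success_rate, decreasing = TRUE), 2))
```

Sorting the shares puts the markets with the highest success proportion first, independent of how many startups each market has.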
ggplot(data=dat, aes(x=post_success, y=venture, fill = post_success)) +
theme_classic() +
geom_boxplot() +
ggtitle("Distribution of Venture Capital Grouped by Success") +
xlab("Success") +
ylab("Venture Capital")
We can observe that successful companies tend to be backed by venture capital. Even top companies such as Facebook, YouTube and Google were backed by venture capital in their initial stages.
ggplot(data = d, aes(x=market, fill = status)) +
theme_classic() +
ggtitle("Distribution of Current StartUps Status Among Top 10 Markets") +
xlab("Market") +
ylab("Number of Companies") +
geom_bar(position="dodge") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Use %in% for membership in the year range; == would recycle 2000:2014 element-wise
d1 = subset(d, founded_year %in% 2000:2014)
d1
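When filtering on a range of years, note that `==` compares two vectors position by position and recycles the shorter one, whereas `%in%` tests each value for membership in the whole range. A small sketch with made-up years (`years` and `target` are hypothetical) shows the difference:

```r
years <- c(2000, 2001, 2001, 2014, 1999)  # made-up founding years
target <- 2000:2002

# `==` recycles `target`: 2000==2000, 2001==2001, 2001==2002, 2014==2000, 1999==2001
# (R also warns here because 5 is not a multiple of 3)
eq_match <- suppressWarnings(years == target)

# `%in%` checks every year against the full range
in_match <- years %in% target

print(eq_match)
print(in_match)
```

The third year, 2001, lies in the range but is missed by `==` because it happens to be compared with 2002.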
ggplot(data = d1, aes(x=market, fill = factor(founded_year))) +
theme_classic() +
ggtitle("Distribution of StartUps Among Top 10 Markets") +
xlab("Market") +
ylab("Number of Companies") +
geom_bar(position="dodge") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
ggplot(data = d1, aes(x=founded_year, fill = market)) +
theme_classic() +
ggtitle("Distribution of StartUps Success Among Top 10 Markets") +
xlab("Founded Year") +
ylab("Number of Companies") +
geom_bar(position="dodge") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
We can observe that the largest numbers of successful startups are in the fields of software and biotechnology.
ggplot(data = d1, aes(x=founded_year, fill = post_success)) +
theme_classic() +
ggtitle("Distribution of Success Among Startups Founded after 2000") +
xlab("Founded Year") +
ylab("Number of Companies") +
geom_bar(position="dodge") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
We can observe that startups founded before 2007 have a good success rate. Between 2008 and 2010 the success rate dropped, and after 2010 it increased again.
# Use %in% for membership in the year range; == would recycle 2000:2014 element-wise
d2 = subset(dat, founded_year %in% 2000:2014)
ggplot(data = d2, aes(x=founded_year, fill = status)) +
theme_classic() +
ggtitle("Distribution of Current StartUps Status Among Founded Year") +
xlab("Founded Year") +
ylab("Number of Companies") +
geom_bar(position="dodge") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
We can observe that every year a few successful companies are acquired, a few are closed, and the rest remain operating.
datCorr <- dat
datCorr$name <- NULL
datCorr$category_list <- NULL
datCorr$market <- NULL
datCorr$status <- NULL
datCorr$state_code <- NULL
datCorr$region <- NULL
datCorr$city <- NULL
datCorr$founded_quarter <- NULL
datCorr$post_success <- as.numeric(as.character(datCorr$post_success))
str(datCorr)
## 'data.frame': 18280 obs. of 33 variables:
## $ funding_total_usd : num 1750000 1750000 2050000 40000 2535000 ...
## $ funding_rounds : num 1 1 4 1 2 6 1 2 1 1 ...
## $ founded_at : num 1 1 1 1 1 1 4 16 1 1 ...
## $ founded_month : num 6 1 1 1 7 1 7 9 1 4 ...
## $ founded_year : num 2012 1984 2001 2008 2010 ...
## $ seed : num 1750000 0 0 40000 15000 0 420000 750000 0 50000 ...
## $ venture : num 0 0 0 0 2520000 ...
## $ equity_crowdfunding : num 0 0 0 0 0 0 0 0 0 0 ...
## $ undisclosed : num 0 0 0 0 0 0 0 0 0 0 ...
## $ convertible_note : num 0 1750000 0 0 0 0 0 0 0 0 ...
## $ debt_financing : num 0 0 2050000 0 0 ...
## $ angel : num 0 0 0 0 0 0 0 0 0 0 ...
## $ grant : num 0 0 0 0 0 0 0 0 0 0 ...
## $ private_equity : num 0 0 0 0 0 0 0 0 0 0 ...
## $ post_ipo_equity : num 0 0 0 0 0 0 0 0 0 0 ...
## $ post_ipo_debt : num 0 0 0 0 0 0 0 0 0 0 ...
## $ secondary_market : num 0 0 0 0 0 0 0 0 0 0 ...
## $ product_crowdfunding: num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_A : num 0 0 0 0 2520000 0 0 0 0 0 ...
## $ round_B : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_C : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_D : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_E : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_F : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_G : num 0 0 0 0 0 0 0 0 0 0 ...
## $ round_H : num 0 0 0 0 0 0 0 0 0 0 ...
## $ first_funding_day : int 30 2 18 23 1 14 26 2 8 1 ...
## $ first_funding_month : num 6 3 6 8 1 10 11 11 3 4 ...
## $ first_funding_year : num 2012 2011 2009 2011 2010 ...
## $ last_funding_day : int 30 2 28 23 16 19 26 30 8 1 ...
## $ last_funding_month : num 6 3 12 8 2 9 11 11 3 4 ...
## $ last_funding_year : num 2012 2011 2011 2011 2011 ...
## $ post_success : num 1 0 0 0 1 0 0 1 0 0 ...
We removed the columns of type “character” and converted post_success to numeric, so that every remaining column can enter the correlation matrix.
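As an alternative to setting each column to NULL by hand, the numeric columns can be selected programmatically. This is only a sketch on a made-up data frame `toy` (hypothetical names and values), assuming the outcome is stored as a factor with levels "0" and "1":

```r
# Hypothetical toy data with one character column and a factor outcome
toy <- data.frame(
  name = c("a", "b"),
  funding_rounds = c(1, 4),
  post_success = factor(c("1", "0")),
  stringsAsFactors = FALSE
)

# Convert the factor outcome to numeric via character, so "1" becomes 1 (not the level index)
toy$post_success <- as.numeric(as.character(toy$post_success))

# Keep only the numeric columns
toy_num <- toy[, sapply(toy, is.numeric), drop = FALSE]
str(toy_num)
```

Going through `as.character` matters: calling `as.numeric` directly on a factor returns the internal level codes instead of the labels.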
chart.Correlation(datCorr,
histogram=TRUE,
method = 'pearson',
pch=19)
rcorr(as.matrix(datCorr))
## funding_total_usd funding_rounds founded_at founded_month
## funding_total_usd 1.00 0.10 -0.01 0.00
## funding_rounds 0.10 1.00 -0.05 -0.03
## founded_at -0.01 -0.05 1.00 0.32
## founded_month 0.00 -0.03 0.32 1.00
## founded_year -0.06 -0.05 0.12 0.19
## seed 0.00 0.07 0.03 0.06
## venture 0.18 0.43 -0.05 -0.04
## equity_crowdfunding 0.00 -0.01 0.02 0.01
## undisclosed 0.01 0.04 -0.01 -0.02
## convertible_note 0.01 0.01 0.00 -0.01
## debt_financing 0.94 0.02 0.00 0.01
## angel 0.00 0.06 0.03 0.06
## grant 0.02 0.02 -0.01 -0.02
## private_equity 0.21 0.06 -0.02 -0.03
## post_ipo_equity 0.19 0.01 0.00 0.01
## post_ipo_debt 0.14 0.00 -0.01 0.00
## secondary_market 0.04 0.03 0.02 0.01
## product_crowdfunding 0.00 0.02 0.01 0.00
## round_A 0.05 0.18 -0.02 0.00
## round_B 0.10 0.26 -0.04 -0.02
## round_C 0.13 0.32 -0.03 -0.01
## round_D 0.12 0.20 -0.01 0.00
## round_E 0.09 0.23 -0.02 -0.01
## round_F 0.08 0.10 0.01 0.01
## round_G 0.04 0.07 0.02 0.00
## round_H 0.02 0.04 0.00 0.00
## first_funding_day 0.01 -0.06 0.00 -0.11
## first_funding_month -0.01 -0.03 0.00 0.03
## first_funding_year -0.01 -0.07 0.02 0.02
## last_funding_day 0.02 0.05 -0.01 -0.08
## last_funding_month -0.01 0.05 0.00 0.02
## last_funding_year 0.02 0.22 0.03 0.03
## post_success 0.02 0.06 -0.03 0.00
## founded_year seed venture equity_crowdfunding undisclosed
## funding_total_usd -0.06 0.00 0.18 0.00 0.01
## funding_rounds -0.05 0.07 0.43 -0.01 0.04
## founded_at 0.12 0.03 -0.05 0.02 -0.01
## founded_month 0.19 0.06 -0.04 0.01 -0.02
## founded_year 1.00 0.10 -0.09 -0.01 -0.03
## seed 0.10 1.00 -0.02 -0.01 -0.01
## venture -0.09 -0.02 1.00 -0.01 0.01
## equity_crowdfunding -0.01 -0.01 -0.01 1.00 0.00
## undisclosed -0.03 -0.01 0.01 0.00 1.00
## convertible_note -0.01 0.00 0.00 0.00 0.00
## debt_financing -0.03 0.00 0.01 0.00 0.00
## angel 0.03 -0.01 0.01 0.00 0.00
## grant -0.06 -0.01 0.02 0.00 0.00
## private_equity -0.06 -0.01 0.07 0.00 0.01
## post_ipo_equity -0.05 -0.01 0.00 0.00 0.00
## post_ipo_debt -0.04 0.00 0.00 0.00 0.00
## secondary_market -0.01 0.00 0.15 0.00 0.00
## product_crowdfunding 0.00 0.30 0.00 0.02 0.00
## round_A 0.01 0.02 0.34 -0.01 0.00
## round_B -0.03 0.00 0.54 -0.01 0.00
## round_C -0.04 -0.01 0.65 -0.01 0.01
## round_D -0.03 -0.02 0.66 0.00 0.00
## round_E -0.03 -0.02 0.48 0.00 0.00
## round_F -0.01 -0.01 0.41 0.00 0.00
## round_G -0.01 0.00 0.19 0.00 0.00
## round_H -0.01 0.00 0.07 0.00 0.00
## first_funding_day -0.05 -0.04 -0.03 0.02 0.01
## first_funding_month -0.02 0.00 -0.02 0.00 0.01
## first_funding_year 0.07 0.02 -0.05 0.01 -0.01
## last_funding_day -0.03 0.00 0.03 0.01 0.02
## last_funding_month 0.01 0.04 0.02 0.01 0.00
## last_funding_year 0.29 0.11 0.03 0.03 -0.05
## post_success -0.13 -0.03 0.05 0.00 0.01
## convertible_note debt_financing angel grant private_equity
## funding_total_usd 0.01 0.94 0.00 0.02 0.21
## funding_rounds 0.01 0.02 0.06 0.02 0.06
## founded_at 0.00 0.00 0.03 -0.01 -0.02
## founded_month -0.01 0.01 0.06 -0.02 -0.03
## founded_year -0.01 -0.03 0.03 -0.06 -0.06
## seed 0.00 0.00 -0.01 -0.01 -0.01
## venture 0.00 0.01 0.01 0.02 0.07
## equity_crowdfunding 0.00 0.00 0.00 0.00 0.00
## undisclosed 0.00 0.00 0.00 0.00 0.01
## convertible_note 1.00 0.00 0.00 0.00 0.00
## debt_financing 0.00 1.00 0.00 0.00 0.01
## angel 0.00 0.00 1.00 -0.01 0.00
## grant 0.00 0.00 -0.01 1.00 0.01
## private_equity 0.00 0.01 0.00 0.01 1.00
## post_ipo_equity 0.00 0.00 0.00 0.00 0.01
## post_ipo_debt 0.00 0.00 0.00 0.00 0.00
## secondary_market 0.00 0.00 0.02 0.00 0.00
## product_crowdfunding 0.00 0.00 0.00 0.00 0.00
## round_A 0.00 0.00 0.01 0.01 0.00
## round_B 0.00 0.01 0.00 0.01 0.02
## round_C 0.00 0.01 0.01 0.02 0.10
## round_D 0.00 0.00 0.02 0.01 0.05
## round_E 0.00 0.01 0.00 0.03 0.04
## round_F 0.00 0.01 0.00 0.00 0.02
## round_G 0.00 0.00 0.00 0.00 0.02
## round_H 0.00 0.00 0.00 0.00 0.02
## first_funding_day 0.00 0.01 -0.06 0.00 0.00
## first_funding_month 0.01 -0.01 -0.03 0.02 0.01
## first_funding_year 0.00 0.00 -0.01 0.00 -0.01
## last_funding_day 0.00 0.01 -0.03 0.00 0.01
## last_funding_month 0.01 -0.01 -0.02 0.02 -0.01
## last_funding_year 0.01 0.01 -0.01 0.00 0.02
## post_success -0.01 0.00 0.00 -0.01 0.00
## post_ipo_equity post_ipo_debt secondary_market
## funding_total_usd 0.19 0.14 0.04
## funding_rounds 0.01 0.00 0.03
## founded_at 0.00 -0.01 0.02
## founded_month 0.01 0.00 0.01
## founded_year -0.05 -0.04 -0.01
## seed -0.01 0.00 0.00
## venture 0.00 0.00 0.15
## equity_crowdfunding 0.00 0.00 0.00
## undisclosed 0.00 0.00 0.00
## convertible_note 0.00 0.00 0.00
## debt_financing 0.00 0.00 0.00
## angel 0.00 0.00 0.02
## grant 0.00 0.00 0.00
## private_equity 0.01 0.00 0.00
## post_ipo_equity 1.00 0.28 0.00
## post_ipo_debt 0.28 1.00 0.00
## secondary_market 0.00 0.00 1.00
## product_crowdfunding 0.00 0.00 0.00
## round_A 0.00 0.00 0.01
## round_B 0.00 0.00 0.03
## round_C 0.00 0.00 0.05
## round_D 0.00 0.00 0.05
## round_E 0.00 0.00 0.01
## round_F 0.01 0.00 0.13
## round_G 0.00 0.00 0.72
## round_H 0.00 0.00 0.00
## first_funding_day 0.00 0.01 -0.01
## first_funding_month 0.00 0.00 0.00
## first_funding_year 0.00 0.00 0.00
## last_funding_day 0.01 0.01 -0.01
## last_funding_month -0.01 0.00 0.00
## last_funding_year 0.01 0.01 0.00
## post_success 0.06 0.00 0.00
## product_crowdfunding round_A round_B round_C round_D
## funding_total_usd 0.00 0.05 0.10 0.13 0.12
## funding_rounds 0.02 0.18 0.26 0.32 0.20
## founded_at 0.01 -0.02 -0.04 -0.03 -0.01
## founded_month 0.00 0.00 -0.02 -0.01 0.00
## founded_year 0.00 0.01 -0.03 -0.04 -0.03
## seed 0.30 0.02 0.00 -0.01 -0.02
## venture 0.00 0.34 0.54 0.65 0.66
## equity_crowdfunding 0.02 -0.01 -0.01 -0.01 0.00
## undisclosed 0.00 0.00 0.00 0.01 0.00
## convertible_note 0.00 0.00 0.00 0.00 0.00
## debt_financing 0.00 0.00 0.01 0.01 0.00
## angel 0.00 0.01 0.00 0.01 0.02
## grant 0.00 0.01 0.01 0.02 0.01
## private_equity 0.00 0.00 0.02 0.10 0.05
## post_ipo_equity 0.00 0.00 0.00 0.00 0.00
## post_ipo_debt 0.00 0.00 0.00 0.00 0.00
## secondary_market 0.00 0.01 0.03 0.05 0.05
## product_crowdfunding 1.00 0.00 0.00 0.00 0.00
## round_A 0.00 1.00 0.35 0.14 0.04
## round_B 0.00 0.35 1.00 0.39 0.12
## round_C 0.00 0.14 0.39 1.00 0.40
## round_D 0.00 0.04 0.12 0.40 1.00
## round_E 0.00 0.04 0.10 0.15 0.23
## round_F 0.00 0.01 0.03 0.05 0.09
## round_G 0.00 0.00 0.01 0.02 0.05
## round_H 0.00 0.00 0.00 0.02 0.02
## first_funding_day 0.00 -0.02 -0.02 -0.02 -0.02
## first_funding_month 0.00 0.00 -0.01 -0.01 -0.01
## first_funding_year 0.00 -0.02 -0.03 -0.04 -0.02
## last_funding_day -0.01 0.01 0.03 0.02 0.01
## last_funding_month 0.01 0.02 0.02 0.01 0.00
## last_funding_year 0.01 0.02 0.01 0.02 0.02
## post_success -0.01 0.05 0.07 0.05 0.02
## round_E round_F round_G round_H first_funding_day
## funding_total_usd 0.09 0.08 0.04 0.02 0.01
## funding_rounds 0.23 0.10 0.07 0.04 -0.06
## founded_at -0.02 0.01 0.02 0.00 0.00
## founded_month -0.01 0.01 0.00 0.00 -0.11
## founded_year -0.03 -0.01 -0.01 -0.01 -0.05
## seed -0.02 -0.01 0.00 0.00 -0.04
## venture 0.48 0.41 0.19 0.07 -0.03
## equity_crowdfunding 0.00 0.00 0.00 0.00 0.02
## undisclosed 0.00 0.00 0.00 0.00 0.01
## convertible_note 0.00 0.00 0.00 0.00 0.00
## debt_financing 0.01 0.01 0.00 0.00 0.01
## angel 0.00 0.00 0.00 0.00 -0.06
## grant 0.03 0.00 0.00 0.00 0.00
## private_equity 0.04 0.02 0.02 0.02 0.00
## post_ipo_equity 0.00 0.01 0.00 0.00 0.00
## post_ipo_debt 0.00 0.00 0.00 0.00 0.01
## secondary_market 0.01 0.13 0.72 0.00 -0.01
## product_crowdfunding 0.00 0.00 0.00 0.00 0.00
## round_A 0.04 0.01 0.00 0.00 -0.02
## round_B 0.10 0.03 0.01 0.00 -0.02
## round_C 0.15 0.05 0.02 0.02 -0.02
## round_D 0.23 0.09 0.05 0.02 -0.02
## round_E 1.00 0.30 0.03 0.10 -0.02
## round_F 0.30 1.00 0.21 0.03 -0.01
## round_G 0.03 0.21 1.00 0.15 -0.01
## round_H 0.10 0.03 0.15 1.00 -0.01
## first_funding_day -0.02 -0.01 -0.01 -0.01 1.00
## first_funding_month -0.02 -0.01 0.00 -0.01 0.10
## first_funding_year -0.02 -0.01 -0.01 0.00 0.02
## last_funding_day 0.00 -0.01 0.00 0.01 0.51
## last_funding_month 0.01 0.01 0.01 0.00 0.05
## last_funding_year 0.03 0.01 0.01 0.01 0.09
## post_success 0.00 0.00 -0.01 0.00 -0.05
## first_funding_month first_funding_year last_funding_day
## funding_total_usd -0.01 -0.01 0.02
## funding_rounds -0.03 -0.07 0.05
## founded_at 0.00 0.02 -0.01
## founded_month 0.03 0.02 -0.08
## founded_year -0.02 0.07 -0.03
## seed 0.00 0.02 0.00
## venture -0.02 -0.05 0.03
## equity_crowdfunding 0.00 0.01 0.01
## undisclosed 0.01 -0.01 0.02
## convertible_note 0.01 0.00 0.00
## debt_financing -0.01 0.00 0.01
## angel -0.03 -0.01 -0.03
## grant 0.02 0.00 0.00
## private_equity 0.01 -0.01 0.01
## post_ipo_equity 0.00 0.00 0.01
## post_ipo_debt 0.00 0.00 0.01
## secondary_market 0.00 0.00 -0.01
## product_crowdfunding 0.00 0.00 -0.01
## round_A 0.00 -0.02 0.01
## round_B -0.01 -0.03 0.03
## round_C -0.01 -0.04 0.02
## round_D -0.01 -0.02 0.01
## round_E -0.02 -0.02 0.00
## round_F -0.01 -0.01 -0.01
## round_G 0.00 -0.01 0.00
## round_H -0.01 0.00 0.01
## first_funding_day 0.10 0.02 0.51
## first_funding_month 1.00 -0.01 0.05
## first_funding_year -0.01 1.00 0.00
## last_funding_day 0.05 0.00 1.00
## last_funding_month 0.47 0.00 0.05
## last_funding_year 0.02 0.13 0.09
## post_success -0.02 -0.08 0.00
## last_funding_month last_funding_year post_success
## funding_total_usd -0.01 0.02 0.02
## funding_rounds 0.05 0.22 0.06
## founded_at 0.00 0.03 -0.03
## founded_month 0.02 0.03 0.00
## founded_year 0.01 0.29 -0.13
## seed 0.04 0.11 -0.03
## venture 0.02 0.03 0.05
## equity_crowdfunding 0.01 0.03 0.00
## undisclosed 0.00 -0.05 0.01
## convertible_note 0.01 0.01 -0.01
## debt_financing -0.01 0.01 0.00
## angel -0.02 -0.01 0.00
## grant 0.02 0.00 -0.01
## private_equity -0.01 0.02 0.00
## post_ipo_equity -0.01 0.01 0.06
## post_ipo_debt 0.00 0.01 0.00
## secondary_market 0.00 0.00 0.00
## product_crowdfunding 0.01 0.01 -0.01
## round_A 0.02 0.02 0.05
## round_B 0.02 0.01 0.07
## round_C 0.01 0.02 0.05
## round_D 0.00 0.02 0.02
## round_E 0.01 0.03 0.00
## round_F 0.01 0.01 0.00
## round_G 0.01 0.01 -0.01
## round_H 0.00 0.01 0.00
## first_funding_day 0.05 0.09 -0.05
## first_funding_month 0.47 0.02 -0.02
## first_funding_year 0.00 0.13 -0.08
## last_funding_day 0.05 0.09 0.00
## last_funding_month 1.00 0.00 -0.03
## last_funding_year 0.00 1.00 -0.29
## post_success -0.03 -0.29 1.00
##
## n= 18280
##
##
## P
## funding_total_usd funding_rounds founded_at founded_month
## funding_total_usd 0.0000 0.3063 0.8025
## funding_rounds 0.0000 0.0000 0.0000
## founded_at 0.3063 0.0000 0.0000
## founded_month 0.8025 0.0000 0.0000
## founded_year 0.0000 0.0000 0.0000 0.0000
## seed 0.5354 0.0000 0.0000 0.0000
## venture 0.0000 0.0000 0.0000 0.0000
## equity_crowdfunding 0.7812 0.0637 0.0383 0.2296
## undisclosed 0.2515 0.0000 0.1100 0.0121
## convertible_note 0.1903 0.2619 0.7058 0.3258
## debt_financing 0.0000 0.0048 0.6278 0.0814
## angel 0.9824 0.0000 0.0001 0.0000
## grant 0.0012 0.0020 0.1314 0.0314
## private_equity 0.0000 0.0000 0.0220 0.0001
## post_ipo_equity 0.0000 0.0629 0.9825 0.2205
## post_ipo_debt 0.0000 0.6161 0.4372 0.6538
## secondary_market 0.0000 0.0000 0.0252 0.1108
## product_crowdfunding 0.6686 0.0212 0.0520 0.7515
## round_A 0.0000 0.0000 0.0007 0.5116
## round_B 0.0000 0.0000 0.0000 0.0041
## round_C 0.0000 0.0000 0.0003 0.3393
## round_D 0.0000 0.0000 0.0620 0.6386
## round_E 0.0000 0.0000 0.0073 0.0542
## round_F 0.0000 0.0000 0.2305 0.1833
## round_G 0.0000 0.0000 0.0017 0.7038
## round_H 0.0371 0.0000 0.7308 0.5442
## first_funding_day 0.3900 0.0000 0.7679 0.0000
## first_funding_month 0.1101 0.0000 0.6575 0.0000
## first_funding_year 0.2279 0.0000 0.0193 0.0170
## last_funding_day 0.0059 0.0000 0.4903 0.0000
## last_funding_month 0.2994 0.0000 0.8751 0.0163
## last_funding_year 0.0059 0.0000 0.0000 0.0004
## post_success 0.0230 0.0000 0.0002 0.6359
## founded_year seed venture equity_crowdfunding
## funding_total_usd 0.0000 0.5354 0.0000 0.7812
## funding_rounds 0.0000 0.0000 0.0000 0.0637
## founded_at 0.0000 0.0000 0.0000 0.0383
## founded_month 0.0000 0.0000 0.0000 0.2296
## founded_year 0.0000 0.0000 0.3379
## seed 0.0000 0.0014 0.3548
## venture 0.0000 0.0014 0.1475
## equity_crowdfunding 0.3379 0.3548 0.1475
## undisclosed 0.0005 0.3213 0.2219 0.8540
## convertible_note 0.1140 0.6589 0.8794 0.9264
## debt_financing 0.0000 0.6283 0.1895 0.9416
## angel 0.0000 0.2233 0.4532 0.7298
## grant 0.0000 0.2530 0.0011 0.8523
## private_equity 0.0000 0.1565 0.0000 0.7621
## post_ipo_equity 0.0000 0.4569 0.7702 0.9219
## post_ipo_debt 0.0000 0.5446 0.6100 0.9372
## secondary_market 0.4757 0.6288 0.0000 0.9467
## product_crowdfunding 0.8910 0.0000 0.5434 0.0259
## round_A 0.4624 0.0059 0.0000 0.3979
## round_B 0.0005 0.8613 0.0000 0.2775
## round_C 0.0000 0.3323 0.0000 0.3349
## round_D 0.0004 0.0221 0.0000 0.6397
## round_E 0.0000 0.0043 0.0000 0.6631
## round_F 0.1364 0.2131 0.0000 0.8662
## round_G 0.2457 0.5798 0.0000 0.9322
## round_H 0.3829 0.7888 0.0000 0.9723
## first_funding_day 0.0000 0.0000 0.0005 0.0384
## first_funding_month 0.0326 0.5971 0.0292 0.9527
## first_funding_year 0.0000 0.0205 0.0000 0.3462
## last_funding_day 0.0000 0.5420 0.0000 0.1359
## last_funding_month 0.3253 0.0000 0.0030 0.4671
## last_funding_year 0.0000 0.0000 0.0000 0.0000
## post_success 0.0000 0.0000 0.0000 0.5413
## undisclosed convertible_note debt_financing angel grant
## funding_total_usd 0.2515 0.1903 0.0000 0.9824 0.0012
## funding_rounds 0.0000 0.2619 0.0048 0.0000 0.0020
## founded_at 0.1100 0.7058 0.6278 0.0001 0.1314
## founded_month 0.0121 0.3258 0.0814 0.0000 0.0314
## founded_year 0.0005 0.1140 0.0000 0.0000 0.0000
## seed 0.3213 0.6589 0.6283 0.2233 0.2530
## venture 0.2219 0.8794 0.1895 0.4532 0.0011
## equity_crowdfunding 0.8540 0.9264 0.9416 0.7298 0.8523
## undisclosed 0.9245 0.9896 0.6083 0.8316
## convertible_note 0.9245 0.9710 0.7834 0.9353
## debt_financing 0.9896 0.9710 0.7960 0.9485
## angel 0.6083 0.7834 0.7960 0.4684
## grant 0.8316 0.9353 0.9485 0.4684
## private_equity 0.3618 0.6058 0.3347 0.7788 0.1679
## post_ipo_equity 0.9175 0.9611 0.9898 0.6662 0.7737
## post_ipo_debt 0.9289 0.9640 0.9728 0.7286 0.9293
## secondary_market 0.9397 0.9695 0.9828 0.0383 0.9383
## product_crowdfunding 0.9255 0.9690 0.9698 0.7864 0.7963
## round_A 0.9829 0.6354 0.8743 0.2120 0.3458
## round_B 0.8658 0.9170 0.3391 0.7631 0.4299
## round_C 0.4660 0.8906 0.2019 0.0845 0.0020
## round_D 0.7980 0.9844 0.6399 0.0026 0.2179
## round_E 0.6411 0.9394 0.3023 0.8804 0.0002
## round_F 0.8986 0.9338 0.4277 0.5910 0.8996
## round_G 0.9936 0.9868 0.9497 0.7079 0.9215
## round_H 0.9687 0.9842 0.9257 0.8787 0.9680
## first_funding_day 0.3200 0.5122 0.1194 0.0000 0.8453
## first_funding_month 0.1778 0.4178 0.1320 0.0002 0.0082
## first_funding_year 0.0857 0.8457 0.9621 0.4182 0.9809
## last_funding_day 0.0016 0.5126 0.1029 0.0000 0.8216
## last_funding_month 0.7856 0.3857 0.1962 0.0123 0.0162
## last_funding_year 0.0000 0.1213 0.1920 0.0653 0.5062
## post_success 0.2209 0.4312 0.7988 0.9547 0.2661
## private_equity post_ipo_equity post_ipo_debt
## funding_total_usd 0.0000 0.0000 0.0000
## funding_rounds 0.0000 0.0629 0.6161
## founded_at 0.0220 0.9825 0.4372
## founded_month 0.0001 0.2205 0.6538
## founded_year 0.0000 0.0000 0.0000
## seed 0.1565 0.4569 0.5446
## venture 0.0000 0.7702 0.6100
## equity_crowdfunding 0.7621 0.9219 0.9372
## undisclosed 0.3618 0.9175 0.9289
## convertible_note 0.6058 0.9611 0.9640
## debt_financing 0.3347 0.9898 0.9728
## angel 0.7788 0.6662 0.7286
## grant 0.1679 0.7737 0.9293
## private_equity 0.0834 0.6523
## post_ipo_equity 0.0834 0.0000
## post_ipo_debt 0.6523 0.0000
## secondary_market 0.9009 0.9690 0.9742
## product_crowdfunding 0.8759 0.9597 0.9676
## round_A 0.6227 0.8223 0.5045
## round_B 0.0110 0.7715 0.6941
## round_C 0.0000 0.9425 0.8381
## round_D 0.0000 0.8083 0.9078
## round_E 0.0000 0.9681 0.8848
## round_F 0.0207 0.4168 0.9349
## round_G 0.0190 0.9590 0.9671
## round_H 0.0031 0.9833 0.9866
## first_funding_day 0.8233 0.6144 0.2369
## first_funding_month 0.3361 0.5219 0.7951
## first_funding_year 0.3327 0.8417 0.7353
## last_funding_day 0.2706 0.1398 0.1006
## last_funding_month 0.4473 0.1672 0.9865
## last_funding_year 0.0137 0.3962 0.0594
## post_success 0.5921 0.0000 0.5306
## secondary_market product_crowdfunding round_A round_B
## funding_total_usd 0.0000 0.6686 0.0000 0.0000
## funding_rounds 0.0000 0.0212 0.0000 0.0000
## founded_at 0.0252 0.0520 0.0007 0.0000
## founded_month 0.1108 0.7515 0.5116 0.0041
## founded_year 0.4757 0.8910 0.4624 0.0005
## seed 0.6288 0.0000 0.0059 0.8613
## venture 0.0000 0.5434 0.0000 0.0000
## equity_crowdfunding 0.9467 0.0259 0.3979 0.2775
## undisclosed 0.9397 0.9255 0.9829 0.8658
## convertible_note 0.9695 0.9690 0.6354 0.9170
## debt_financing 0.9828 0.9698 0.8743 0.3391
## angel 0.0383 0.7864 0.2120 0.7631
## grant 0.9383 0.7963 0.3458 0.4299
## private_equity 0.9009 0.8759 0.6227 0.0110
## post_ipo_equity 0.9690 0.9597 0.8223 0.7715
## post_ipo_debt 0.9742 0.9676 0.5045 0.6941
## secondary_market 0.9725 0.3856 0.0000
## product_crowdfunding 0.9725 0.9214 0.6850
## round_A 0.3856 0.9214 0.0000
## round_B 0.0000 0.6850 0.0000
## round_C 0.0000 0.6189 0.0000 0.0000
## round_D 0.0000 0.8092 0.0000 0.0000
## round_E 0.4499 0.8222 0.0000 0.0000
## round_F 0.0000 0.9307 0.3047 0.0000
## round_G 0.0000 0.9650 0.9091 0.2902
## round_H 0.9886 0.9857 0.7602 0.8049
## first_funding_day 0.3679 0.5686 0.0031 0.0019
## first_funding_month 0.7326 0.8361 0.5139 0.1313
## first_funding_year 0.7720 0.8700 0.0104 0.0000
## last_funding_day 0.2287 0.4176 0.0703 0.0000
## last_funding_month 0.8479 0.0517 0.0018 0.0014
## last_funding_year 0.6149 0.2207 0.0338 0.2013
## post_success 0.5257 0.4366 0.0000 0.0000
## round_C round_D round_E round_F round_G round_H
## funding_total_usd 0.0000 0.0000 0.0000 0.0000 0.0000 0.0371
## funding_rounds 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## founded_at 0.0003 0.0620 0.0073 0.2305 0.0017 0.7308
## founded_month 0.3393 0.6386 0.0542 0.1833 0.7038 0.5442
## founded_year 0.0000 0.0004 0.0000 0.1364 0.2457 0.3829
## seed 0.3323 0.0221 0.0043 0.2131 0.5798 0.7888
## venture 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
## equity_crowdfunding 0.3349 0.6397 0.6631 0.8662 0.9322 0.9723
## undisclosed 0.4660 0.7980 0.6411 0.8986 0.9936 0.9687
## convertible_note 0.8906 0.9844 0.9394 0.9338 0.9868 0.9842
## debt_financing 0.2019 0.6399 0.3023 0.4277 0.9497 0.9257
## angel 0.0845 0.0026 0.8804 0.5910 0.7079 0.8787
## grant 0.0020 0.2179 0.0002 0.8996 0.9215 0.9680
## private_equity 0.0000 0.0000 0.0000 0.0207 0.0190 0.0031
## post_ipo_equity 0.9425 0.8083 0.9681 0.4168 0.9590 0.9833
## post_ipo_debt 0.8381 0.9078 0.8848 0.9349 0.9671 0.9866
## secondary_market 0.0000 0.0000 0.4499 0.0000 0.0000 0.9886
## product_crowdfunding 0.6189 0.8092 0.8222 0.9307 0.9650 0.9857
## round_A 0.0000 0.0000 0.0000 0.3047 0.9091 0.7602
## round_B 0.0000 0.0000 0.0000 0.0000 0.2902 0.8049
## round_C 0.0000 0.0000 0.0000 0.0012 0.0328
## round_D 0.0000 0.0000 0.0000 0.0000 0.0414
## round_E 0.0000 0.0000 0.0000 0.0000 0.0000
## round_F 0.0000 0.0000 0.0000 0.0000 0.0000
## round_G 0.0012 0.0000 0.0000 0.0000 0.0000
## round_H 0.0328 0.0414 0.0000 0.0000 0.0000
## first_funding_day 0.0012 0.0032 0.0200 0.1789 0.0468 0.2483
## first_funding_month 0.2390 0.4434 0.0100 0.1702 0.6172 0.1361
## first_funding_year 0.0000 0.0059 0.0031 0.2473 0.4219 0.6748
## last_funding_day 0.0107 0.1932 0.5436 0.1914 0.8577 0.3251
## last_funding_month 0.0899 0.7467 0.3880 0.3464 0.2510 0.8652
## last_funding_year 0.0207 0.0140 0.0003 0.0615 0.4419 0.3685
## post_success 0.0000 0.0363 0.5459 0.7626 0.4709 0.7219
## first_funding_day first_funding_month first_funding_year
## funding_total_usd 0.3900 0.1101 0.2279
## funding_rounds 0.0000 0.0000 0.0000
## founded_at 0.7679 0.6575 0.0193
## founded_month 0.0000 0.0000 0.0170
## founded_year 0.0000 0.0326 0.0000
## seed 0.0000 0.5971 0.0205
## venture 0.0005 0.0292 0.0000
## equity_crowdfunding 0.0384 0.9527 0.3462
## undisclosed 0.3200 0.1778 0.0857
## convertible_note 0.5122 0.4178 0.8457
## debt_financing 0.1194 0.1320 0.9621
## angel 0.0000 0.0002 0.4182
## grant 0.8453 0.0082 0.9809
## private_equity 0.8233 0.3361 0.3327
## post_ipo_equity 0.6144 0.5219 0.8417
## post_ipo_debt 0.2369 0.7951 0.7353
## secondary_market 0.3679 0.7326 0.7720
## product_crowdfunding 0.5686 0.8361 0.8700
## round_A 0.0031 0.5139 0.0104
## round_B 0.0019 0.1313 0.0000
## round_C 0.0012 0.2390 0.0000
## round_D 0.0032 0.4434 0.0059
## round_E 0.0200 0.0100 0.0031
## round_F 0.1789 0.1702 0.2473
## round_G 0.0468 0.6172 0.4219
## round_H 0.2483 0.1361 0.6748
## first_funding_day 0.0000 0.0031
## first_funding_month 0.0000 0.1530
## first_funding_year 0.0031 0.1530
## last_funding_day 0.0000 0.0000 0.8152
## last_funding_month 0.0000 0.0000 0.6874
## last_funding_year 0.0000 0.0336 0.0000
## post_success 0.0000 0.0310 0.0000
## last_funding_day last_funding_month last_funding_year
## funding_total_usd 0.0059 0.2994 0.0059
## funding_rounds 0.0000 0.0000 0.0000
## founded_at 0.4903 0.8751 0.0000
## founded_month 0.0000 0.0163 0.0004
## founded_year 0.0000 0.3253 0.0000
## seed 0.5420 0.0000 0.0000
## venture 0.0000 0.0030 0.0000
## equity_crowdfunding 0.1359 0.4671 0.0000
## undisclosed 0.0016 0.7856 0.0000
## convertible_note 0.5126 0.3857 0.1213
## debt_financing 0.1029 0.1962 0.1920
## angel 0.0000 0.0123 0.0653
## grant 0.8216 0.0162 0.5062
## private_equity 0.2706 0.4473 0.0137
## post_ipo_equity 0.1398 0.1672 0.3962
## post_ipo_debt 0.1006 0.9865 0.0594
## secondary_market 0.2287 0.8479 0.6149
## product_crowdfunding 0.4176 0.0517 0.2207
## round_A 0.0703 0.0018 0.0338
## round_B 0.0000 0.0014 0.2013
## round_C 0.0107 0.0899 0.0207
## round_D 0.1932 0.7467 0.0140
## round_E 0.5436 0.3880 0.0003
## round_F 0.1914 0.3464 0.0615
## round_G 0.8577 0.2510 0.4419
## round_H 0.3251 0.8652 0.3685
## first_funding_day 0.0000 0.0000 0.0000
## first_funding_month 0.0000 0.0000 0.0336
## first_funding_year 0.8152 0.6874 0.0000
## last_funding_day 0.0000 0.0000
## last_funding_month 0.0000 0.9713
## last_funding_year 0.0000 0.9713
## post_success 0.7852 0.0003 0.0000
## post_success
## funding_total_usd 0.0230
## funding_rounds 0.0000
## founded_at 0.0002
## founded_month 0.6359
## founded_year 0.0000
## seed 0.0000
## venture 0.0000
## equity_crowdfunding 0.5413
## undisclosed 0.2209
## convertible_note 0.4312
## debt_financing 0.7988
## angel 0.9547
## grant 0.2661
## private_equity 0.5921
## post_ipo_equity 0.0000
## post_ipo_debt 0.5306
## secondary_market 0.5257
## product_crowdfunding 0.4366
## round_A 0.0000
## round_B 0.0000
## round_C 0.0000
## round_D 0.0363
## round_E 0.5459
## round_F 0.7626
## round_G 0.4709
## round_H 0.7219
## first_funding_day 0.0000
## first_funding_month 0.0310
## first_funding_year 0.0000
## last_funding_day 0.7852
## last_funding_month 0.0003
## last_funding_year 0.0000
## post_success
From the last matrix (the p-values), we can see that funding_rounds, founded_at, founded_year, seed, venture, post_ipo_equity, round_A, round_B, round_C, first_funding_day, first_funding_year, last_funding_month and last_funding_year are all significantly associated with the outcome variable post_success (p-values < 0.01).
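Reading the significant variables off a large printed matrix is error-prone, so the selection can also be done in code. With Hmisc, the result of `rcorr` is a list whose `P` element holds the p-value matrix, so its `post_success` column could be filtered directly. The sketch below shows the same idea with base R's `cor.test` on simulated data (the data frame `toy` and its column names are hypothetical), so that it runs without extra packages:

```r
set.seed(42)
n <- 200

# Simulated data: `related` drives the outcome, `unrelated` is pure noise
toy <- data.frame(
  related = rnorm(n),
  unrelated = rnorm(n)
)
toy$outcome <- toy$related + rnorm(n, sd = 0.5)

# Pearson correlation test of each predictor against the outcome
predictors <- setdiff(names(toy), "outcome")
p_values <- sapply(predictors, function(v) cor.test(toy[[v]], toy$outcome)$p.value)

# Keep the predictors whose p-value falls below the 0.01 threshold
significant <- names(p_values)[p_values < 0.01]
print(significant)
```

A significant p-value with 18,280 observations can still come with a very small correlation, so the coefficients in the first matrix are worth reading alongside the p-values.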